Sentence Object Notation: Multilingual sentence notation based on Wordnet
Authors
Abstract
Representing sentences is an important task, since a sentence representation can serve as a format for exchanging data between applications. A notation for this purpose should be both compact and representative; compactness reduces transfer time and, ideally, processing time as well. Sentence representations are usually tied to the language being processed, because the grammar of that language shapes how the sentence is represented. To avoid language-dependent notations, we propose a representation that uses the meanings of words rather than the words themselves. This can be done with a lexicon such as WordNet: instead of words, we use their synsets. The syntactic relations, likewise, should be as universal as possible. Our notation is called STON (SenTences Object Notation) and bears some similarity to JSON. It is meant to be a minimal, representative, and language-independent syntactic representation. We also want it to be readable and easy to write, which simplifies developing simple automatic generators and creating test banks by hand. It is intended as an intermediate format between the components of applications such as text summarization and language translation. The notation is based on four languages: Arabic, English, French, and Japanese, and there are some cases where these languages do not agree on a single representation. Moreover, given the diversity of grammatical structures across the world's languages, the notation may fail for some languages, which leaves room for future improvement.
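The core idea in the abstract, replacing words with language-neutral meaning identifiers and keeping the serialized form compact, can be illustrated with a small sketch. This is not STON syntax, which the abstract does not show: the lexicon, the concept identifiers (`c001`, ...), the role names (`act`, `agt`, `thm`), and the `encode` function are all hypothetical and stand in for real WordNet synsets and the paper's actual relations.

```python
import json

# Hypothetical toy lexicon: (language, lemma) -> made-up concept identifier.
# A real system would look these up in a WordNet-style resource instead.
TOY_LEXICON = {
    ("en", "cat"): "c001",
    ("fr", "chat"): "c001",
    ("en", "eat"): "c002",
    ("fr", "manger"): "c002",
    ("en", "fish"): "c003",
    ("fr", "poisson"): "c003",
}


def to_concept(lang, lemma):
    """Map a lemma in a given language to its language-neutral concept ID."""
    return TOY_LEXICON[(lang, lemma)]


def encode(lang, verb, subject, obj):
    """Encode a simple subject-verb-object sentence as compact JSON.

    The role names are assumptions made for this sketch only:
    act = action, agt = agent, thm = theme.
    """
    record = {
        "act": to_concept(lang, verb),
        "agt": to_concept(lang, subject),
        "thm": to_concept(lang, obj),
    }
    # Compact separators and sorted keys give a minimal, stable encoding.
    return json.dumps(record, separators=(",", ":"), sort_keys=True)


english = encode("en", "eat", "cat", "fish")
french = encode("fr", "manger", "chat", "poisson")
print(english)
print(english == french)
```

Because the words are resolved to shared concept identifiers before serialization, the English and French inputs produce the same string, which is the property a language-independent notation aims for.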
Similar resources
Recognizing Textual Entailment Via Atomic Propositions
This paper describes Macquarie University’s Centre for Language Technology contribution to the PASCAL 2005 Recognizing Textual Entailment challenge. Our main aim was to test the practicability of a purely logical approach. For this, atomic propositions were extracted from both the text and the entailment hypothesis and they were expressed in a custom logical notation. The text entails the hypot...
A Notation for Markov Decision Processes
Many reinforcement learning (RL) research papers contain paragraphs that define Markov decision processes (MDPs). These paragraphs take up space that could otherwise be used to present more useful content. In this paper we specify a notation for MDPs that can be used by other papers. Declaring the use of this notation using a single sentence can replace several paragraphs of notational specificati...
Canonicity Effect on Sentence Processing of Persian-speaking Broca’s Patients
Introduction: Fundamental notions of mapping hypothesis and canonicity were scrutinized in Persian-speaking aphasics. Methods: To this end, the performance of four age-, education-, and gender-matched Persian-speaking Broca's patients and eight matched healthy controls in diverse complex structures were compared via the conduction of two tasks of syntactic comprehension and grammaticality jud...
Analyzing Syntactic Structure
For analyzing the structure of a sentence we will use constituent analysis, a hierarchical model introduced by Chomsky [1]. In this framework, sentences are modeled in the form of a parse tree (see Figure 1), where leaves are lexical items, or terminals, and internal nodes are constituent labels. The constituent labels form grammatical types from the lexical items, and lexical items form the se...
Parsing Korean based on Dependency Grammar and GULP
This paper presents a parsing algorithm in Prolog using GULP, based on dependency grammar and unification-based grammar. It parses declarative sentences of a free-word-order language, Korean. The dependency grammar accepts free order of the words in a sentence. Unification-based features separate the grammar from the parsing algorithm and also simplify the notation of the grammar. GULP (Graph ...
Journal title: CoRR
Volume: abs/1801.00984
Publication date: 2018